Overview

Dataset statistics

Number of variables: 34
Number of observations: 532614
Missing cells: 2063675
Missing cells (%): 11.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 103.6 MiB
Average record size in memory: 204.0 B
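
The two derived figures in this block can be recomputed from the raw counts. The sketch below assumes (the report does not state this) that the missing-cell percentage is taken over rows × columns and that the average record size is total memory divided by the number of rows; the function name is illustrative only.

```python
def overview_ratios(n_vars, n_obs, missing_cells, total_mib):
    """Return (missing-cell percentage, average record size in bytes)."""
    total_cells = n_vars * n_obs                   # every row has one cell per variable
    missing_pct = 100.0 * missing_cells / total_cells
    avg_record_bytes = total_mib * 1024 * 1024 / n_obs
    return missing_pct, avg_record_bytes


# Figures taken from the dataset statistics of this report.
missing_pct, avg_record_bytes = overview_ratios(34, 532614, 2063675, 103.6)
print(f"{missing_pct:.1f}% missing, {avg_record_bytes:.1f} B per record")
```

Under these assumptions both values agree with the reported 11.4% and 204.0 B after rounding.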

Variable types

Numeric: 10
Categorical: 21
Text: 1
DateTime: 1
Boolean: 1

Alerts

city_id_old has constant value "C006" [Constant]
country_id has constant value "Turkey" [Constant]
city_code has constant value "Konya" [Constant]
hierarchy3_id has a high cardinality: 77 distinct values [High cardinality]
hierarchy4_id has a high cardinality: 151 distinct values [High cardinality]
hierarchy5_id has a high cardinality: 292 distinct values [High cardinality]
Unnamed: 0 is highly overall correlated with store_id and 2 other fields [High correlation]
cluster_id is highly overall correlated with hierarchy3_id [High correlation]
hierarchy1_id is highly overall correlated with hierarchy2_id and 3 other fields [High correlation]
hierarchy2_id is highly overall correlated with hierarchy1_id and 4 other fields [High correlation]
hierarchy3_id is highly overall correlated with cluster_id and 5 other fields [High correlation]
holiday is highly overall correlated with weekday [High correlation]
month_name is highly overall correlated with promo_discount_type_2 and 1 other fields [High correlation]
price is highly overall correlated with promo_bin_2 and 2 other fields [High correlation]
promo_bin_1 is highly overall correlated with promo_bin_2 and 2 other fields [High correlation]
promo_bin_2 is highly overall correlated with hierarchy2_id and 8 other fields [High correlation]
promo_discount_2 is highly overall correlated with hierarchy1_id and 8 other fields [High correlation]
promo_discount_type_2 is highly overall correlated with hierarchy1_id and 9 other fields [High correlation]
promo_type_1 is highly overall correlated with promo_bin_2 [High correlation]
promo_type_2 is highly overall correlated with promo_bin_2 and 2 other fields [High correlation]
revenue is highly overall correlated with sales [High correlation]
sales is highly overall correlated with revenue [High correlation]
season is highly overall correlated with month_name and 4 other fields [High correlation]
store_id is highly overall correlated with Unnamed: 0 and 2 other fields [High correlation]
store_size is highly overall correlated with Unnamed: 0 and 2 other fields [High correlation]
storetype_id is highly overall correlated with Unnamed: 0 and 2 other fields [High correlation]
week is highly overall correlated with season [High correlation]
weekday is highly overall correlated with holiday [High correlation]
promo_type_1 is highly imbalanced (76.6%) [Imbalance]
promo_type_2 is highly imbalanced (99.4%) [Imbalance]
promo_bin_1 has 456834 (85.8%) missing values [Missing]
promo_bin_2 has 532135 (99.9%) missing values [Missing]
promo_discount_2 has 532135 (99.9%) missing values [Missing]
promo_discount_type_2 has 532135 (99.9%) missing values [Missing]
sales is highly skewed (γ1 = 37.03501339) [Skewed]
revenue is highly skewed (γ1 = 224.1724057) [Skewed]
Unnamed: 0 has unique values [Unique]
sales has 461233 (86.6%) zeros [Zeros]
revenue has 461290 (86.6%) zeros [Zeros]
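
The imbalance percentages in the alerts can be read as one minus the normalized Shannon entropy of the category counts. That reading is an assumption about how the profiler scores imbalance, not something the report states, but it does reproduce the 99.4% shown for promo_type_2 from the category counts listed in that variable's section. The function name below is illustrative.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 = perfectly balanced, near 1 = one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a single class scores 0 here; the report flags such columns as Constant.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Category counts of promo_type_2 as reported: PR03, PR02, PR01, PR04.
promo_type_2_counts = [532135, 354, 121, 4]
score = imbalance_score(promo_type_2_counts)
print(f"{100 * score:.1f}%")
```

A score this close to 100% means nearly every row falls into one category, so the column carries very little information on its own.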

Reproduction

Analysis started: 2024-07-11 16:14:19.912484
Analysis finished: 2024-07-11 16:19:37.399884
Duration: 5 minutes and 17.49 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 532614
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6388555.1
Minimum: 1793963
Maximum: 8519019
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 1793963
5-th percentile: 1820593.6
Q1: 5840915.2
median: 5974068.5
Q3: 8385865.8
95-th percentile: 8492388.3
Maximum: 8519019
Range: 6725056
Interquartile range (IQR): 2544950.5

Descriptive statistics

Standard deviation: 2029951
Coefficient of variation (CV): 0.31774806
Kurtosis: 0.36080466
Mean: 6388555.1
Median Absolute Deviation (MAD): 203453
Skewness: -0.94877514
Sum: 3.4026339 × 10^12
Variance: 4.1207011 × 10^12
Monotonicity: Strictly increasing
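
Several of these descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks them with the rounded figures reported for this variable, so agreement is only approximate, and shows one way a strict-monotonicity check could be written. The helper names are illustrative, and the sample values are the five smallest values listed for this variable.

```python
def coefficient_of_variation(std, mean):
    """CV = standard deviation / mean."""
    return std / mean


def is_strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


std, mean = 2029951, 6388555.1           # rounded figures from the report
cv = coefficient_of_variation(std, mean)
variance = std ** 2                       # about 4.12e12
first_values = [1793963, 1793964, 1793965, 1793966, 1793967]

print(f"CV = {cv:.4f}, variance = {variance:.4e}")
print("strictly increasing:", is_strictly_increasing(first_values))
```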
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
1793963                1       < 0.1%
8341492                1       < 0.1%
8341490                1       < 0.1%
8341489                1       < 0.1%
8341488                1       < 0.1%
8341487                1       < 0.1%
8341486                1       < 0.1%
8341485                1       < 0.1%
8341484                1       < 0.1%
8341483                1       < 0.1%
Other values (532604)  532604  > 99.9%

Value    Count  Frequency (%)
1793963  1      < 0.1%
1793964  1      < 0.1%
1793965  1      < 0.1%
1793966  1      < 0.1%
1793967  1      < 0.1%
1793968  1      < 0.1%
1793969  1      < 0.1%
1793970  1      < 0.1%
1793971  1      < 0.1%
1793972  1      < 0.1%

Value    Count  Frequency (%)
8519019  1      < 0.1%
8519018  1      < 0.1%
8519017  1      < 0.1%
8519016  1      < 0.1%
8519015  1      < 0.1%
8519014  1      < 0.1%
8519013  1      < 0.1%
8519012  1      < 0.1%
8519011  1      < 0.1%
8519010  1      < 0.1%

store_id
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB
S0094: 267115
S0142: 203453
S0030: 62046

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 2663070
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S0030
2nd row: S0030
3rd row: S0030
4th row: S0030
5th row: S0030

Common Values

Value  Count   Frequency (%)
S0094  267115  50.2%
S0142  203453  38.2%
S0030  62046   11.6%
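
A common-values table like the one above is a frequency count with each count expressed as a share of all rows. The sketch below rebuilds it with the standard library; the column is reconstructed from the reported counts as a stand-in for the real data, and the function name is illustrative.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common()]


# Stand-in column rebuilt from the counts reported for store_id.
store_id = ["S0094"] * 267115 + ["S0142"] * 203453 + ["S0030"] * 62046
table = frequency_table(store_id)
for value, count, pct in table:
    print(f"{value}  {count}  {pct}%")
```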

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
s0094  267115  50.2%
s0142  203453  38.2%
s0030  62046   11.6%

Most occurring characters

Value  Count   Frequency (%)
0      923821  34.7%
S      532614  20.0%
4      470568  17.7%
9      267115  10.0%
1      203453  7.6%
2      203453  7.6%
3      62046   2.3%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2663070  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2663070  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2663070  100.0%

Distinct: 480
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 2663070
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P0015
2nd row: P0018
3rd row: P0035
4th row: P0051
5th row: P0055

Value               Count   Frequency (%)
p0453               3004    0.6%
p0125               3003    0.6%
p0536               3002    0.6%
p0015               3001    0.6%
p0325               2993    0.6%
p0664               2992    0.6%
p0372               2984    0.6%
p0055               2981    0.6%
p0348               2980    0.6%
p0364               2978    0.6%
Other values (470)  502696  94.4%

Most occurring characters

Value  Count   Frequency (%)
0      703072  26.4%
P      532614  20.0%
1      200291  7.5%
6      182000  6.8%
5      181108  6.8%
2      180972  6.8%
4      177471  6.7%
3      172096  6.5%
7      139061  5.2%
9      101551  3.8%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2663070  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2663070  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2663070  100.0%

date
Date

Distinct: 1002
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB
Minimum: 2017-01-02 00:00:00
Maximum: 2019-09-30 00:00:00
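
The distinct count and the date range together indicate whether the calendar is complete. As a small sketch using only the figures reported above, the number of calendar days between the minimum and maximum (both ends included) can be compared with the number of distinct dates; the variable names are illustrative.

```python
from datetime import date

first, last = date(2017, 1, 2), date(2019, 9, 30)   # reported minimum and maximum
calendar_days = (last - first).days + 1             # inclusive of both endpoints
distinct_dates = 1002                                # reported distinct count

has_gaps = distinct_dates < calendar_days
print(calendar_days, "calendar days;", "gaps present" if has_gaps else "no missing days")
```

Here the two numbers coincide, which suggests every day in the range appears at least once in the data.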
Histogram with fixed size bins (bins=50)

sales
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 510
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.33648231
Minimum: 0
Maximum: 301
Zeros: 461233
Zeros (%): 86.6%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 301
Range: 301
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.8039157
Coefficient of variation (CV): 5.3611011
Kurtosis: 4020.8841
Mean: 0.33648231
Median Absolute Deviation (MAD): 0
Skewness: 37.035013
Sum: 179215.19
Variance: 3.2541119
Monotonicity: Not monotonic
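
The large positive skewness goes hand in hand with the share of zeros: most rows are 0 and a few large values form a long right tail. The sketch below uses a plain moment-based estimator (g1 = m3 / m2^1.5, no bias correction), which may differ slightly from the estimator the profiler uses. The toy list is invented for illustration and is not taken from the dataset; only the zero count and row count come from the report.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Share of zeros, from the reported counts for sales.
zeros_pct = 100 * 461233 / 532614

# Hypothetical zero-heavy sample with one extreme value (illustration only).
zero_heavy = [0] * 87 + [1] * 8 + [2] * 3 + [5, 300]
symmetric = [1, 2, 3, 4, 5]

print(f"zeros: {zeros_pct:.1f}%")
print("zero-heavy skewness is positive:", skewness(zero_heavy) > 0)
print("symmetric skewness:", skewness(symmetric))
```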
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   461233  86.6%
1                   40236   7.6%
2                   13562   2.5%
3                   5585    1.0%
4                   3363    0.6%
5                   1987    0.4%
6                   1401    0.3%
7                   923     0.2%
8                   703     0.1%
9                   533     0.1%
Other values (500)  3088    0.6%

Value  Count   Frequency (%)
0      461233  86.6%
0.048  1       < 0.1%
0.064  3       < 0.1%
0.074  1       < 0.1%
0.078  1       < 0.1%
0.092  1       < 0.1%
0.096  2       < 0.1%
0.1    2       < 0.1%
0.108  1       < 0.1%
0.11   1       < 0.1%

Value  Count  Frequency (%)
301    1      < 0.1%
300    1      < 0.1%
215    1      < 0.1%
193    1      < 0.1%
160    1      < 0.1%
156    1      < 0.1%
113    1      < 0.1%
95     1      < 0.1%
86     2      < 0.1%
82     1      < 0.1%

revenue
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 2685
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4032907
Minimum: 0
Maximum: 5879.35
Zeros: 461290
Zeros (%): 86.6%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 7.87
Maximum: 5879.35
Range: 5879.35
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 14.525375
Coefficient of variation (CV): 10.350939
Kurtosis: 77171.231
Mean: 1.4032907
Median Absolute Deviation (MAD): 0
Skewness: 224.17241
Sum: 747412.25
Variance: 210.98653
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    461290  86.6%
0.93                 1777    0.3%
3.24                 1530    0.3%
1.85                 1364    0.3%
2.78                 1255    0.2%
2.31                 1246    0.2%
3.7                  925     0.2%
1.39                 867     0.2%
1.16                 835     0.2%
6.48                 812     0.2%
Other values (2675)  60713   11.4%

Value  Count   Frequency (%)
0      461290  86.6%
0.01   6       < 0.1%
0.02   3       < 0.1%
0.23   31      < 0.1%
0.31   2       < 0.1%
0.36   1       < 0.1%
0.42   119     < 0.1%
0.46   225     < 0.1%
0.47   1       < 0.1%
0.51   1       < 0.1%

Value    Count  Frequency (%)
5879.35  1      < 0.1%
4874.07  1      < 0.1%
2101.94  1      < 0.1%
1968.94  1      < 0.1%
1811.57  1      < 0.1%
1596.1   1      < 0.1%
1347.5   1      < 0.1%
1303.38  1      < 0.1%
1270.34  1      < 0.1%
1256.14  1      < 0.1%

stock
Real number (ℝ)

Distinct: 856
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.629129
Minimum: 0
Maximum: 2700
Zeros: 2219
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
median: 8
Q3: 16
95-th percentile: 50
Maximum: 2700
Range: 2700
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 29.790146
Coefficient of variation (CV): 1.9060657
Kurtosis: 806.67867
Mean: 15.629129
Median Absolute Deviation (MAD): 5
Skewness: 14.67625
Sum: 8324292.8
Variance: 887.45277
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
6                   40463   7.6%
4                   39627   7.4%
5                   37232   7.0%
3                   36774   6.9%
2                   32793   6.2%
1                   28817   5.4%
7                   28326   5.3%
8                   27492   5.2%
12                  24722   4.6%
9                   23092   4.3%
Other values (846)  213276  40.0%

Value  Count  Frequency (%)
0      2219   0.4%
0.384  2      < 0.1%
0.415  13     < 0.1%
0.464  1      < 0.1%
0.515  1      < 0.1%
0.705  3      < 0.1%
0.785  1      < 0.1%
0.832  3      < 0.1%
0.93   5      < 0.1%
0.931  1      < 0.1%

Value  Count  Frequency (%)
2700   1      < 0.1%
2683   1      < 0.1%
2644   1      < 0.1%
2574   1      < 0.1%
2560   1      < 0.1%
2512   1      < 0.1%
2399   1      < 0.1%
1253   1      < 0.1%
1175   1      < 0.1%
1080   1      < 0.1%

price
Real number (ℝ)

HIGH CORRELATION 

Distinct: 443
Distinct (%): 0.1%
Missing: 1012
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.33836
Minimum: 0.01
Maximum: 1599
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 1
Q1: 3.5
median: 8.65
Q3: 17.99
95-th percentile: 59.9
Maximum: 1599
Range: 1598.99
Interquartile range (IQR): 14.49
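
The last two quantile statistics are differences of the others: the range is maximum minus minimum, and the interquartile range is Q3 minus Q1. A minimal sketch with the price figures reported above (the function name is illustrative):

```python
def spread(minimum, q1, q3, maximum):
    """Range and interquartile range from the four summary points."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


price = spread(minimum=0.01, q1=3.5, q3=17.99, maximum=1599)
print(f"range = {price['range']:.2f}, IQR = {price['iqr']:.2f}")
```

The IQR covers only the middle half of the values, so it is far smaller than the range whenever a few extreme prices stretch the upper tail.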

Descriptive statistics

Standard deviation: 31.232437
Coefficient of variation (CV): 1.9116017
Kurtosis: 561.88314
Mean: 16.33836
Median Absolute Deviation (MAD): 6.05
Skewness: 16.528175
Sum: 8685505.1
Variance: 975.46512
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
1                   12394   2.3%
3.5                 10446   2.0%
3.95                8869    1.7%
19.9                8684    1.6%
11.9                8324    1.6%
0.75                7556    1.4%
2.95                7296    1.4%
29.9                7156    1.3%
12.9                7099    1.3%
4.9                 6908    1.3%
Other values (433)  446870  83.9%

Value  Count  Frequency (%)
0.01   13     < 0.1%
0.25   466    0.1%
0.3    6      < 0.1%
0.35   4      < 0.1%
0.4    6      < 0.1%
0.45   664    0.1%
0.5    3240   0.6%
0.58   26     < 0.1%
0.6    1471   0.3%
0.65   3059   0.6%

Value  Count  Frequency (%)
1599   15     < 0.1%
1549   1      < 0.1%
1499   2      < 0.1%
1449   3      < 0.1%
1399   13     < 0.1%
1349   18     < 0.1%
699    185    < 0.1%
679    22     < 0.1%
655    24     < 0.1%
599    28     < 0.1%

promo_type_1
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 520.9 KiB
PR14: 456834
PR05: 35490
PR10: 12782
PR03: 9071
PR06: 7590
Other values (11): 10847

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2130456
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PR14
2nd row: PR14
3rd row: PR14
4th row: PR14
5th row: PR05

Common Values

Value             Count   Frequency (%)
PR14              456834  85.8%
PR05              35490   6.7%
PR10              12782   2.4%
PR03              9071    1.7%
PR06              7590    1.4%
PR07              3111    0.6%
PR12              2353    0.4%
PR09              1963    0.4%
PR17              1890    0.4%
PR01              654     0.1%
Other values (6)  876     0.2%

Length

Histogram of lengths of the category

Value             Count   Frequency (%)
pr14              456834  85.8%
pr05              35490   6.7%
pr10              12782   2.4%
pr03              9071    1.7%
pr06              7590    1.4%
pr07              3111    0.6%
pr12              2353    0.4%
pr09              1963    0.4%
pr17              1890    0.4%
pr01              654     0.1%
Other values (6)  876     0.2%

Most occurring characters

Value             Count   Frequency (%)
P                 532614  25.0%
R                 532614  25.0%
1                 475201  22.3%
4                 457023  21.5%
0                 71105   3.3%
5                 35490   1.7%
3                 9101    0.4%
6                 7638    0.4%
7                 5001    0.2%
2                 2353    0.1%
Other values (2)  2316    0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2130456  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2130456  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2130456  100.0%

promo_bin_1
Categorical

HIGH CORRELATION  MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 456834
Missing (%): 85.8%
Memory size: 520.5 KiB
verylow: 30355
low: 15786
moderate: 12351
high: 9385
veryhigh: 7903

Length

Max length: 8
Median length: 7
Mean length: 6.0624835
Min length: 3

Characters and Unicode

Total characters: 459415
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: verylow
2nd row: verylow
3rd row: verylow
4th row: low
5th row: moderate

Common Values

Value      Count   Frequency (%)
verylow    30355   5.7%
low        15786   3.0%
moderate   12351   2.3%
high       9385    1.8%
veryhigh   7903    1.5%
(Missing)  456834  85.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
verylow   30355  40.1%
low       15786  20.8%
moderate  12351  16.3%
high      9385   12.4%
veryhigh  7903   10.4%

Most occurring characters

Value             Count  Frequency (%)
e                 62960  13.7%
o                 58492  12.7%
r                 50609  11.0%
l                 46141  10.0%
w                 46141  10.0%
v                 38258  8.3%
y                 38258  8.3%
h                 34576  7.5%
i                 17288  3.8%
g                 17288  3.8%
Other values (4)  49404  10.8%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  459415  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  459415  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  459415  100.0%

promo_type_2
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 520.5 KiB
PR03: 532135
PR02: 354
PR01: 121
PR04: 4

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2130456
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PR03
2nd row: PR03
3rd row: PR03
4th row: PR03
5th row: PR03

Common Values

Value  Count   Frequency (%)
PR03   532135  99.9%
PR02   354     0.1%
PR01   121     < 0.1%
PR04   4       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
pr03   532135  99.9%
pr02   354     0.1%
pr01   121     < 0.1%
pr04   4       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
P      532614  25.0%
R      532614  25.0%
0      532614  25.0%
3      532135  25.0%
2      354     < 0.1%
1      121     < 0.1%
4      4       < 0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2130456  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2130456  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2130456  100.0%

promo_bin_2
Categorical

HIGH CORRELATION  MISSING 

Distinct: 3
Distinct (%): 0.6%
Missing: 532135
Missing (%): 99.9%
Memory size: 520.4 KiB
verylow: 326
veryhigh: 122
high: 31

Length

Max length: 8
Median length: 7
Mean length: 7.0605428
Min length: 4

Characters and Unicode

Total characters: 3382
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: verylow
2nd row: verylow
3rd row: verylow
4th row: verylow
5th row: verylow

Common Values

Value      Count   Frequency (%)
verylow    326     0.1%
veryhigh   122     < 0.1%
high       31      < 0.1%
(Missing)  532135  99.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
verylow   326    68.1%
veryhigh  122    25.5%
high      31     6.5%

Most occurring characters

Value  Count  Frequency (%)
v      448    13.2%
e      448    13.2%
r      448    13.2%
y      448    13.2%
l      326    9.6%
o      326    9.6%
w      326    9.6%
h      306    9.0%
i      153    4.5%
g      153    4.5%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  3382   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  3382   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  3382   100.0%

promo_discount_2
Categorical

HIGH CORRELATION  MISSING 

Distinct: 5
Distinct (%): 1.0%
Missing: 532135
Missing (%): 99.9%
Memory size: 4.1 MiB
20.0: 308
50.0: 122
35.0: 28
16.0: 18
40.0: 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1916
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20.0
2nd row: 20.0
3rd row: 20.0
4th row: 20.0
5th row: 20.0

Common Values

Value      Count   Frequency (%)
20.0       308     0.1%
50.0       122     < 0.1%
35.0       28      < 0.1%
16.0       18      < 0.1%
40.0       3       < 0.1%
(Missing)  532135  99.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
20.0   308    64.3%
50.0   122    25.5%
35.0   28     5.8%
16.0   18     3.8%
40.0   3      0.6%

Most occurring characters

Value  Count  Frequency (%)
0      912    47.6%
.      479    25.0%
2      308    16.1%
5      150    7.8%
3      28     1.5%
1      18     0.9%
6      18     0.9%
4      3      0.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1916   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1916   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1916   100.0%

promo_discount_type_2
Categorical

HIGH CORRELATION  MISSING 

Distinct: 4
Distinct (%): 0.8%
Missing: 532135
Missing (%): 99.9%
Memory size: 520.5 KiB
PR02: 185
PR04: 141
PR03: 109
PR01: 44

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1916
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PR04
2nd row: PR04
3rd row: PR04
4th row: PR04
5th row: PR04

Common Values

Value      Count   Frequency (%)
PR02       185     < 0.1%
PR04       141     < 0.1%
PR03       109     < 0.1%
PR01       44      < 0.1%
(Missing)  532135  99.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
pr02   185    38.6%
pr04   141    29.4%
pr03   109    22.8%
pr01   44     9.2%

Most occurring characters

Value  Count  Frequency (%)
P      479    25.0%
R      479    25.0%
0      479    25.0%
2      185    9.7%
4      141    7.4%
3      109    5.7%
1      44     2.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1916   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1916   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1916   100.0%

product_length
Real number (ℝ)

Distinct: 108
Distinct (%): < 0.1%
Missing: 3158
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5208278
Minimum: 0
Maximum: 100
Zeros: 598
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2.8
median: 5
Q3: 7.5
95-th percentile: 20
Maximum: 100
Range: 100
Interquartile range (IQR): 4.7

Descriptive statistics

Standard deviation: 6.6375649
Coefficient of variation (CV): 1.0179022
Kurtosis: 46.2261
Mean: 6.5208278
Median Absolute Deviation (MAD): 2.5
Skewness: 4.8038105
Sum: 3452491.4
Variance: 44.057267
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
2                  52699   9.9%
1                  38538   7.2%
5                  37524   7.0%
6                  32373   6.1%
4.5                32087   6.0%
3                  24588   4.6%
4                  22580   4.2%
7                  17071   3.2%
6.5                12662   2.4%
7.5                11537   2.2%
Other values (98)  247797  46.5%

Value  Count  Frequency (%)
0      598    0.1%
0.3    376    0.1%
0.5    2914   0.5%
1      38538  7.2%
1.5    8186   1.5%
1.6    2570   0.5%
1.7    5622   1.1%
1.8    2179   0.4%
2      52699  9.9%
2.1    524    0.1%

Value  Count  Frequency (%)
100    545    0.1%
59     215    < 0.1%
47.8   678    0.1%
44     72     < 0.1%
40.6   258    < 0.1%
40     1262   0.2%
33     300    0.1%
30     2063   0.4%
29.3   162    < 0.1%
28     1918   0.4%

product_depth
Real number (ℝ)

Distinct: 139
Distinct (%): < 0.1%
Missing: 3133
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.636908
Minimum: 0
Maximum: 160
Zeros: 598
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 11.1
median: 17
Q3: 22.5
95-th percentile: 33
Maximum: 160
Range: 160
Interquartile range (IQR): 11.4

Descriptive statistics

Standard deviation: 11.421757
Coefficient of variation (CV): 0.64760542
Kurtosis: 33.694102
Mean: 17.636908
Median Absolute Deviation (MAD): 5.9
Skewness: 3.8725702
Sum: 9338407.5
Variance: 130.45653
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
17                  19431   3.6%
25                  17363   3.3%
20                  16765   3.1%
15                  16733   3.1%
12                  16475   3.1%
18                  15680   2.9%
16                  15285   2.9%
4                   15065   2.8%
23                  14713   2.8%
14                  14259   2.7%
Other values (129)  367712  69.0%

Value  Count  Frequency (%)
0      598    0.1%
1      1089   0.2%
1.5    503    0.1%
2      140    < 0.1%
3      8967   1.7%
3.5    343    0.1%
3.8    1642   0.3%
4      15065  2.8%
4.5    6848   1.3%
4.8    202    < 0.1%

Value  Count  Frequency (%)
160    545    0.1%
100    1262   0.2%
88     287    0.1%
80     678    0.1%
77     1409   0.3%
56     1714   0.3%
55     491    0.1%
48     1776   0.3%
47     625    0.1%
45.7   1348   0.3%

product_width
Real number (ℝ)

Distinct: 123
Distinct (%): < 0.1%
Missing: 3133
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.352003
Minimum: 0
Maximum: 100
Zeros: 598
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 7.5
median: 10
Q3: 15
95-th percentile: 30
Maximum: 100
Range: 100
Interquartile range (IQR): 7.5

Descriptive statistics

Standard deviation: 8.1855113
Coefficient of variation (CV): 0.66268697
Kurtosis: 16.339503
Mean: 12.352003
Median Absolute Deviation (MAD): 3
Skewness: 2.7833404
Sum: 6540150.8
Variance: 67.002596
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
10                  37446   7.0%
9                   31888   6.0%
8                   24807   4.7%
13                  24092   4.5%
11                  20271   3.8%
7                   19307   3.6%
12                  17634   3.3%
16                  15083   2.8%
4.5                 14395   2.7%
7.5                 14050   2.6%
Other values (113)  310508  58.3%

Value  Count  Frequency (%)
0      598    0.1%
1      1783   0.3%
2      6009   1.1%
2.5    2984   0.6%
3      6546   1.2%
3.3    1002   0.2%
3.5    771    0.1%
3.6    1827   0.3%
3.8    718    0.1%
4      5184   1.0%

Value  Count  Frequency (%)
100    545    0.1%
59     215    < 0.1%
50     1262   0.2%
47.8   678    0.1%
47     72     < 0.1%
44.5   27     < 0.1%
44.4   2899   0.5%
40.5   563    0.1%
40     758    0.1%
38     1259   0.2%

cluster_id
Categorical

HIGH CORRELATION 

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.1 MiB
cluster_0
311482 
cluster_9
44791 
cluster_4
36893 
cluster_3
35911 
cluster_6
 
24874
Other values (5)
78663 

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters4793526
Distinct characters18
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: cluster_1
2nd row: cluster_4
3rd row: cluster_7
4th row: cluster_7
5th row: cluster_0

Common Values

Value                 Count     Frequency (%)
cluster_0             311482    58.5%
cluster_9             44791     8.4%
cluster_4             36893     6.9%
cluster_3             35911     6.7%
cluster_6             24874     4.7%
cluster_8             22275     4.2%
cluster_7             16694     3.1%
cluster_5             15773     3.0%
cluster_2             13890     2.6%
cluster_1             10031     1.9%
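
The Frequency (%) column is each count divided by the number of observations. The sketch below reproduces it for `cluster_id` from counts copied by hand out of the Common Values listing; the dictionary and variable names are illustrative only.

```python
# Counts copied from the Common Values listing for cluster_id.
counts = {
    "cluster_0": 311482,
    "cluster_9": 44791,
    "cluster_4": 36893,
    "cluster_3": 35911,
    "cluster_6": 24874,
    "cluster_8": 22275,
    "cluster_7": 16694,
    "cluster_5": 15773,
    "cluster_2": 13890,
    "cluster_1": 10031,
}

# With no missing values, the counts add up to the number of observations.
total = sum(counts.values())

# Frequency (%) is count / total, shown to one decimal place.
freq_pct = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in freq_pct.items():
    print(f"{name}: {pct}%")
```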

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
c                     532614    11.1%
l                     532614    11.1%
u                     532614    11.1%
s                     532614    11.1%
t                     532614    11.1%
e                     532614    11.1%
r                     532614    11.1%
_                     532614    11.1%
0                     311482    6.5%
9                     44791     0.9%
Other values (8)      176341    3.7%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 4793526 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

hierarchy1_id
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 520.5 KiB

Value                 Count
H00                   224943
H01                   165294
H03                   141404
H02                   973

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1597842
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: H00
2nd row: H00
3rd row: H00
4th row: H00
5th row: H00

Common Values

Value                 Count     Frequency (%)
H00                   224943    42.2%
H01                   165294    31.0%
H03                   141404    26.5%
H02                   973       0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
0                     757557    47.4%
H                     532614    33.3%
1                     165294    10.3%
3                     141404    8.8%
2                     973       0.1%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 1597842 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

hierarchy2_id
Categorical

HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 520.9 KiB

Value                 Count
H0108                 84549
H0003                 77588
H0002                 55668
H0313                 53774
H0000                 37446
Other values (13)     223589

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 2663070
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: H0000
2nd row: H0003
3rd row: H0003
4th row: H0003
5th row: H0003

Common Values

Value                 Count     Frequency (%)
H0108                 84549     15.9%
H0003                 77588     14.6%
H0002                 55668     10.5%
H0313                 53774     10.1%
H0000                 37446     7.0%
H0312                 36916     6.9%
H0001                 35292     6.6%
H0106                 34969     6.6%
H0107                 29867     5.6%
H0314                 19706     3.7%
Other values (8)      66839     12.5%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value                 Count      Frequency (%)
0                     1186213    44.5%
H                     532614     20.0%
1                     355666     13.4%
3                     272766     10.2%
2                     93557      3.5%
8                     84549      3.2%
4                     38655      1.5%
6                     36552      1.4%
7                     34513      1.3%
5                     27684      1.0%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 2663070 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

hierarchy3_id
Categorical

HIGH CARDINALITY, HIGH CORRELATION

Distinct: 77
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 522.9 KiB

Value                 Count
H000312               31981
H010601               26643
H010807               24152
H000004               22336
H000200               20249
Other values (72)     407253

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 3728298
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: H000003
2nd row: H000316
3rd row: H000311
4th row: H000314
5th row: H000311

Common Values

Value                 Count     Frequency (%)
H000312               31981     6.0%
H010601               26643     5.0%
H010807               24152     4.5%
H000004               22336     4.2%
H000200               20249     3.8%
H010805               19395     3.6%
H031302               16777     3.1%
H000316               16696     3.1%
H000201               16464     3.1%
H000102               14183     2.7%
Other values (67)     323738    60.8%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value                 Count      Frequency (%)
0                     1679772    45.1%
1                     574386     15.4%
H                     532614     14.3%
3                     306925     8.2%
2                     182293     4.9%
8                     106394     2.9%
5                     91953      2.5%
4                     81151      2.2%
7                     79688      2.1%
6                     73400      2.0%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 3728298 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

hierarchy4_id
Categorical

HIGH CARDINALITY

Distinct: 151
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 MiB

Value                 Count
H00031200             25007
H01080500             14967
H00010210             14183
H00000405             13846
H01060113             13047
Other values (146)    451564

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 4793526
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: H00000309
2nd row: H00031609
3rd row: H00031100
4th row: H00031409
5th row: H00031109

Common Values

Value                 Count     Frequency (%)
H00031200             25007     4.7%
H01080500             14967     2.8%
H00010210             14183     2.7%
H00000405             13846     2.6%
H01060113             13047     2.4%
H00020000             12368     2.3%
H00031609             11845     2.2%
H01080900             8911      1.7%
H03130700             8843      1.7%
H01080709             8826      1.7%
Other values (141)    400771    75.2%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value                 Count      Frequency (%)
0                     2332131    48.7%
1                     744777     15.5%
H                     532614     11.1%
3                     338064     7.1%
2                     209978     4.4%
5                     159172     3.3%
8                     115013     2.4%
9                     105322     2.2%
4                     94604      2.0%
7                     86627      1.8%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 4793526 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

hierarchy5_id
Categorical

HIGH CARDINALITY

Distinct: 292
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 MiB

Value                 Count
H0001021012           10979
H0000040501           10941
H0003160922           9724
H0002000926           7881
H0000040001           6976
Other values (287)    486113

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 5858754
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: H0000030901
2nd row: H0003160922
3rd row: H0003110017
4th row: H0003140912
5th row: H0003110906

Common Values

Value                 Count     Frequency (%)
H0001021012           10979     2.1%
H0000040501           10941     2.1%
H0003160922           9724      1.8%
H0002000926           7881      1.5%
H0000040001           6976      1.3%
H0106011307           6823      1.3%
H0106011422           6376      1.2%
H0003120012           6334      1.2%
H0000030001           6038      1.1%
H0312110917           5896      1.1%
Other values (282)    454646    85.4%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value                 Count      Frequency (%)
0                     2640566    45.1%
1                     1022351    17.4%
H                     532614     9.1%
3                     414634     7.1%
2                     410687     7.0%
5                     176809     3.0%
6                     157237     2.7%
7                     139239     2.4%
8                     127899     2.2%
4                     124501     2.1%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 5858754 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

storetype_id
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
ST04                  470568
ST03                  62046

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2130456
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ST03
2nd row: ST03
3rd row: ST03
4th row: ST03
5th row: ST03

Common Values

Value                 Count     Frequency (%)
ST04                  470568    88.4%
ST03                  62046     11.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
S                     532614    25.0%
T                     532614    25.0%
0                     532614    25.0%
4                     470568    22.1%
3                     62046     2.9%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 2130456 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

store_size
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
45                    267115
31                    203453
13                    62046

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1065228
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 13
2nd row: 13
3rd row: 13
4th row: 13
5th row: 13

Common Values

Value                 Count     Frequency (%)
45                    267115    50.2%
31                    203453    38.2%
13                    62046     11.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
4                     267115    25.1%
5                     267115    25.1%
3                     265499    24.9%
1                     265499    24.9%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 1065228 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

city_id_old
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
C006                  532614

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2130456
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C006
2nd row: C006
3rd row: C006
4th row: C006
5th row: C006

Common Values

Value                 Count     Frequency (%)
C006                  532614    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count      Frequency (%)
0                     1065228    50.0%
C                     532614     25.0%
6                     532614     25.0%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 2130456 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

country_id
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
Turkey                532614

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 3195684
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Turkey
2nd row: Turkey
3rd row: Turkey
4th row: Turkey
5th row: Turkey

Common Values

Value                 Count     Frequency (%)
Turkey                532614    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
T                     532614    16.7%
u                     532614    16.7%
r                     532614    16.7%
k                     532614    16.7%
e                     532614    16.7%
y                     532614    16.7%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 3195684 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

city_code
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
Konya                 532614

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 2663070
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Konya
2nd row: Konya
3rd row: Konya
4th row: Konya
5th row: Konya

Common Values

Value                 Count     Frequency (%)
Konya                 532614    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
K                     532614    20.0%
o                     532614    20.0%
n                     532614    20.0%
y                     532614    20.0%
a                     532614    20.0%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 2663070 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

day
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.746554
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 16
Q3: 23
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.7800467
Coefficient of variation (CV): 0.55758528
Kurtosis: -1.1904873
Mean: 15.746554
Median Absolute Deviation (MAD): 8
Skewness: 0.0046516988
Sum: 8386835
Variance: 77.08922
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=31)]

Common Values

Value                 Count     Frequency (%)
20                    17599     3.3%
19                    17596     3.3%
23                    17595     3.3%
14                    17575     3.3%
21                    17573     3.3%
18                    17572     3.3%
24                    17568     3.3%
22                    17568     3.3%
11                    17566     3.3%
16                    17566     3.3%
Other values (21)     356836    67.0%

Minimum 10 values

Value                 Count     Frequency (%)
1                     17011     3.2%
2                     17439     3.3%
3                     17443     3.3%
4                     17435     3.3%
5                     17441     3.3%
6                     17445     3.3%
7                     17453     3.3%
8                     17476     3.3%
9                     17520     3.3%
10                    17542     3.3%

Maximum 10 values

Value                 Count     Frequency (%)
31                    10123     1.9%
30                    16059     3.0%
29                    16041     3.0%
28                    17552     3.3%
27                    17531     3.3%
26                    17525     3.3%
25                    17561     3.3%
24                    17568     3.3%
23                    17595     3.3%
22                    17568     3.3%

weekday
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
Mon                   76399
Sat                   76213
Fri                   76084
Sun                   76084
Thu                   75975
Other values (2)      151859

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1597842
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mon
2nd row: Mon
3rd row: Mon
4th row: Mon
5th row: Mon

Common Values

Value                 Count     Frequency (%)
Mon                   76399     14.3%
Sat                   76213     14.3%
Fri                   76084     14.3%
Sun                   76084     14.3%
Thu                   75975     14.3%
Wed                   75972     14.3%
Tue                   75887     14.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
u                     227946    14.3%
n                     152483    9.5%
S                     152297    9.5%
T                     151862    9.5%
e                     151859    9.5%
M                     76399     4.8%
o                     76399     4.8%
a                     76213     4.8%
t                     76213     4.8%
F                     76084     4.8%
Other values (5)      380087    23.8%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 1597842 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

season
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
3                     150328
2                     145810
1                     138489
4                     97987

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 532614
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value                 Count     Frequency (%)
3                     150328    28.2%
2                     145810    27.4%
1                     138489    26.0%
4                     97987     18.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value                 Count     Frequency (%)
3                     150328    28.2%
2                     145810    27.4%
1                     138489    26.0%
4                     97987     18.4%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 532614 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

week
Real number (ℝ)

HIGH CORRELATION

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.308542
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 13
median: 25
Q3: 37
95-th percentile: 49
Maximum: 53
Range: 52
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.331339
Coefficient of variation (CV): 0.56626489
Kurtosis: -1.0633578
Mean: 25.308542
Median Absolute Deviation (MAD): 12
Skewness: 0.1050274
Sum: 13479684
Variance: 205.38727
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value                 Count     Frequency (%)
30                    11503     2.2%
32                    11501     2.2%
31                    11493     2.2%
38                    11483     2.2%
33                    11482     2.2%
37                    11474     2.2%
39                    11446     2.1%
29                    11437     2.1%
28                    11422     2.1%
35                    11372     2.1%
Other values (43)     418001    78.5%

Minimum 10 values

Value                 Count     Frequency (%)
1                     7162      1.3%
2                     10603     2.0%
3                     10667     2.0%
4                     10657     2.0%
5                     10737     2.0%
6                     10792     2.0%
7                     10826     2.0%
8                     10794     2.0%
9                     10726     2.0%
10                    10888     2.0%

Maximum 10 values

Value                 Count     Frequency (%)
53                    3185      0.6%
52                    7547      1.4%
51                    7566      1.4%
50                    7515      1.4%
49                    7453      1.4%
48                    7348      1.4%
47                    7455      1.4%
46                    7518      1.4%
45                    7438      1.4%
44                    7402      1.4%

holiday
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 520.3 KiB

Value                 Count     Frequency (%)
False                 375463    70.5%
True                  157151    29.5%

month_name
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB

Value                 Count
Jul                   50655
Aug                   50646
May                   49588
Sep                   49027
Mar                   48677
Other values (7)      284021

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1597842
Distinct characters: 22
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jan
2nd row: Jan
3rd row: Jan
4th row: Jan
5th row: Jan

Common Values

Value                 Count     Frequency (%)
Jul                   50655     9.5%
Aug                   50646     9.5%
May                   49588     9.3%
Sep                   49027     9.2%
Mar                   48677     9.1%
Apr                   48274     9.1%
Jun                   47948     9.0%
Jan                   46650     8.8%
Feb                   43162     8.1%
Dec                   33306     6.3%
Other values (2)      64681     12.1%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value                 Count     Frequency (%)
u                     149249    9.3%
J                     145253    9.1%
a                     144915    9.1%
e                     125495    7.9%
A                     98920     6.2%
M                     98265     6.1%
p                     97301     6.1%
r                     96951     6.1%
n                     94598     5.9%
c                     66120     4.1%
Other values (12)     480775    30.1%

Most occurring categories, scripts, and blocks

Each classification has a single value, (unknown), which covers all 1597842 characters (100.0%). The most frequent characters per category, script, and block are identical to those under "Most occurring characters".

Interactions

[Figures: 100 interaction plots (SVG, Matplotlib v3.7.1), generated 2024-07-11; images not included.]

Correlations

[Figure: correlation heatmap (SVG, Matplotlib v3.7.1); image not included.]
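variable | Unnamed: 0 | cluster_id | day | hierarchy1_id | hierarchy2_id | hierarchy3_id | holiday | month_name | price | product_depth | product_length | product_width | promo_bin_1 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | promo_type_1 | promo_type_2 | revenue | sales | season | stock | store_id | store_size | storetype_id | week | weekday
Unnamed: 0 | 1.000 | 0.084 | 0.014 | 0.102 | 0.151 | 0.232 | 0.004 | 0.110 | 0.112 | -0.028 | 0.065 | -0.016 | 0.076 | 0.063 | 0.068 | 0.106 | 0.049 | 0.007 | -0.053 | -0.056 | 0.104 | -0.017 | 1.000 | 1.000 | 1.000 | 0.068 | 0.000
cluster_id | 0.084 | 1.000 | 0.002 | 0.248 | 0.306 | 0.535 | 0.001 | 0.016 | -0.264 | 0.079 | -0.014 | -0.103 | 0.142 | 0.492 | 0.389 | 0.483 | 0.064 | 0.018 | 0.112 | 0.117 | 0.026 | 0.126 | 0.100 | 0.100 | 0.128 | 0.006 | 0.000
day | 0.014 | 0.002 | 1.000 | 0.000 | 0.000 | 0.000 | 0.057 | 0.034 | 0.004 | -0.000 | 0.002 | -0.000 | 0.033 | 0.350 | 0.266 | 0.420 | 0.012 | 0.024 | -0.003 | -0.002 | 0.017 | 0.001 | 0.000 | 0.000 | 0.000 | 0.093 | 0.028
hierarchy1_id | 0.102 | 0.248 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.014 | 0.466 | 0.175 | 0.040 | 0.160 | 0.177 | 0.461 | 0.522 | 0.642 | 0.101 | 0.011 | -0.212 | -0.225 | 0.011 | -0.223 | 0.123 | 0.123 | 0.136 | 0.008 | 0.000
hierarchy2_id | 0.151 | 0.306 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.016 | 0.430 | 0.119 | -0.062 | 0.152 | 0.268 | 0.926 | 0.783 | 0.913 | 0.096 | 0.032 | -0.173 | -0.183 | 0.025 | -0.206 | 0.181 | 0.181 | 0.210 | 0.008 | 0.000
hierarchy3_id | 0.232 | 0.535 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.033 | 0.424 | 0.105 | -0.077 | 0.145 | 0.435 | 0.957 | 0.869 | 0.984 | 0.153 | 0.051 | -0.174 | -0.184 | 0.054 | -0.207 | 0.272 | 0.272 | 0.324 | 0.007 | 0.000
holiday | 0.004 | 0.001 | 0.057 | 0.000 | 0.000 | 0.000 | 1.000 | 0.037 | 0.002 | 0.001 | -0.000 | 0.001 | 0.045 | 0.434 | 0.438 | 0.470 | 0.035 | 0.012 | 0.053 | 0.051 | 0.011 | -0.002 | 0.000 | 0.000 | 0.000 | 0.010 | 0.978
month_name | 0.110 | 0.016 | 0.034 | 0.014 | 0.016 | 0.033 | 0.037 | 1.000 | 0.007 | -0.003 | -0.001 | -0.002 | 0.090 | 0.360 | 0.388 | 0.665 | 0.036 | 0.038 | -0.000 | -0.001 | 1.000 | 0.000 | 0.010 | 0.010 | 0.010 | 0.309 | 0.030
price | 0.112 | -0.264 | 0.004 | 0.466 | 0.430 | 0.424 | 0.002 | 0.007 | 1.000 | 0.252 | 0.298 | 0.110 | 0.043 | 1.000 | 1.000 | 1.000 | 0.081 | 0.000 | -0.233 | -0.261 | 0.016 | -0.390 | 0.022 | 0.022 | 0.010 | 0.021 | 0.000
product_depth | -0.028 | 0.079 | -0.000 | 0.175 | 0.119 | 0.105 | 0.001 | -0.003 | 0.252 | 1.000 | 0.193 | 0.167 | 0.092 | 0.251 | 0.227 | 0.265 | 0.040 | 0.005 | -0.008 | -0.019 | 0.021 | -0.116 | 0.062 | 0.062 | 0.062 | 0.001 | 0.000
product_length | 0.065 | -0.014 | 0.002 | 0.040 | -0.062 | -0.077 | -0.000 | -0.001 | 0.298 | 0.193 | 1.000 | 0.150 | 0.070 | 0.108 | 0.261 | 0.113 | 0.031 | 0.002 | -0.069 | -0.081 | 0.024 | -0.194 | 0.054 | 0.054 | 0.053 | 0.002 | 0.000
product_width | -0.016 | -0.103 | -0.000 | 0.160 | 0.152 | 0.145 | 0.001 | -0.002 | 0.110 | 0.167 | 0.150 | 1.000 | 0.066 | 0.321 | 0.451 | 0.238 | 0.038 | 0.007 | -0.009 | -0.012 | 0.019 | -0.059 | 0.078 | 0.078 | 0.077 | -0.003 | 0.000
promo_bin_1 | 0.076 | 0.142 | 0.033 | 0.177 | 0.268 | 0.435 | 0.045 | 0.090 | 0.043 | 0.092 | 0.070 | 0.066 | 1.000 | 0.943 | 0.798 | 0.888 | 0.419 | 0.029 | 0.077 | 0.080 | 0.064 | 0.119 | 0.087 | 0.087 | 0.117 | 0.026 | 0.032
promo_bin_2 | 0.063 | 0.492 | 0.350 | 0.461 | 0.926 | 0.957 | 0.434 | 0.360 | 1.000 | 0.251 | 0.108 | 0.321 | 0.943 | 1.000 | 0.998 | 0.855 | 0.567 | 0.695 | -0.091 | -0.099 | 1.000 | -0.118 | 0.063 | 0.063 | 0.092 | -0.369 | 0.351
promo_discount_2 | 0.068 | 0.389 | 0.266 | 0.522 | 0.783 | 0.869 | 0.438 | 0.388 | 1.000 | 0.227 | 0.261 | 0.451 | 0.798 | 0.998 | 1.000 | 0.727 | 0.400 | 0.703 | 0.098 | 0.102 | 1.000 | 0.044 | 0.068 | 0.068 | 0.107 | 0.416 | 0.259
promo_discount_type_2 | 0.106 | 0.483 | 0.420 | 0.642 | 0.913 | 0.984 | 0.470 | 0.665 | 1.000 | 0.265 | 0.113 | 0.238 | 0.888 | 0.855 | 0.727 | 1.000 | 0.395 | 0.701 | 0.206 | 0.218 | 1.000 | 0.277 | 0.106 | 0.106 | 0.134 | -0.367 | 0.321
promo_type_1 | 0.049 | 0.064 | 0.012 | 0.101 | 0.096 | 0.153 | 0.035 | 0.036 | 0.081 | 0.040 | 0.031 | 0.038 | 0.419 | 0.567 | 0.400 | 0.395 | 1.000 | 0.011 | -0.037 | -0.032 | 0.041 | 0.026 | 0.049 | 0.049 | 0.066 | 0.015 | 0.019
promo_type_2 | 0.007 | 0.018 | 0.024 | 0.011 | 0.032 | 0.051 | 0.012 | 0.038 | 0.000 | 0.005 | 0.002 | 0.007 | 0.029 | 0.695 | 0.703 | 0.701 | 0.011 | 1.000 | -0.004 | -0.004 | 0.028 | -0.003 | 0.001 | 0.001 | 0.000 | -0.025 | 0.009
revenue | -0.053 | 0.112 | -0.003 | -0.212 | -0.173 | -0.174 | 0.053 | -0.000 | -0.233 | -0.008 | -0.069 | -0.009 | 0.077 | -0.091 | 0.098 | 0.206 | -0.037 | -0.004 | 1.000 | 0.995 | 0.000 | 0.197 | 0.000 | 0.000 | 0.000 | 0.006 | 0.002
sales | -0.056 | 0.117 | -0.002 | -0.225 | -0.183 | -0.184 | 0.051 | -0.001 | -0.261 | -0.019 | -0.081 | -0.012 | 0.080 | -0.099 | 0.102 | 0.218 | -0.032 | -0.004 | 0.995 | 1.000 | 0.003 | 0.212 | 0.012 | 0.012 | 0.006 | 0.005 | 0.006
season | 0.104 | 0.026 | 0.017 | 0.011 | 0.025 | 0.054 | 0.011 | 1.000 | 0.016 | 0.021 | 0.024 | 0.019 | 0.064 | 1.000 | 1.000 | 1.000 | 0.041 | 0.028 | 0.000 | 0.003 | 1.000 | -0.003 | 0.008 | 0.008 | 0.009 | 0.966 | 0.007
stock | -0.017 | 0.126 | 0.001 | -0.223 | -0.206 | -0.207 | -0.002 | 0.000 | -0.390 | -0.116 | -0.194 | -0.059 | 0.119 | -0.118 | 0.044 | 0.277 | 0.026 | -0.003 | 0.197 | 0.212 | -0.003 | 1.000 | 0.019 | 0.019 | 0.013 | -0.004 | 0.000
store_id | 1.000 | 0.100 | 0.000 | 0.123 | 0.181 | 0.272 | 0.000 | 0.010 | 0.022 | 0.062 | 0.054 | 0.078 | 0.087 | 0.063 | 0.068 | 0.106 | 0.049 | 0.001 | 0.000 | 0.012 | 0.008 | 0.019 | 1.000 | 1.000 | 1.000 | 0.009 | 0.000
store_size | 1.000 | 0.100 | 0.000 | 0.123 | 0.181 | 0.272 | 0.000 | 0.010 | 0.022 | 0.062 | 0.054 | 0.078 | 0.087 | 0.063 | 0.068 | 0.106 | 0.049 | 0.001 | 0.000 | 0.012 | 0.008 | 0.019 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000
storetype_id | 1.000 | 0.128 | 0.000 | 0.136 | 0.210 | 0.324 | 0.000 | 0.010 | 0.010 | 0.062 | 0.053 | 0.077 | 0.117 | 0.092 | 0.107 | 0.134 | 0.066 | 0.000 | 0.000 | 0.006 | 0.009 | 0.013 | 1.000 | 1.000 | 1.000 | 0.008 | 0.000
week | 0.068 | 0.006 | 0.093 | 0.008 | 0.008 | 0.007 | 0.010 | 0.309 | 0.021 | 0.001 | 0.002 | -0.003 | 0.026 | -0.369 | 0.416 | -0.367 | 0.015 | -0.025 | 0.006 | 0.005 | 0.966 | -0.004 | 0.009 | 0.000 | 0.008 | 1.000 | 0.010
weekday | 0.000 | 0.000 | 0.028 | 0.000 | 0.000 | 0.000 | 0.978 | 0.030 | 0.000 | 0.000 | 0.000 | 0.000 | 0.032 | 0.351 | 0.259 | 0.321 | 0.019 | 0.009 | 0.002 | 0.006 | 0.007 | 0.000 | 0.000 | 0.000 | 0.000 | 0.010 | 1.000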
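
A correlation matrix of this size is easier to read once the strongest pairs are listed on their own. The following is a minimal sketch, assuming the matrix is available as a pandas DataFrame. The helper name `high_correlation_pairs` and the 0.9 threshold are illustrative choices, not settings taken from this report. The excerpt copies five variables from the matrix above.

```python
import pandas as pd


def high_correlation_pairs(corr: pd.DataFrame, threshold: float = 0.9):
    """List each variable pair once when |correlation| >= threshold.

    The threshold is an assumed, illustrative value.
    """
    pairs = []
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # upper triangle only, skips the diagonal
            value = float(corr.loc[a, b])
            if abs(value) >= threshold:
                pairs.append((a, b, value))
    # Strongest relationships first
    return sorted(pairs, key=lambda p: -abs(p[2]))


# Excerpt of the matrix above (five variables only)
names = ["revenue", "sales", "stock", "week", "season"]
excerpt = pd.DataFrame(
    [
        [1.000, 0.995, 0.197, 0.006, 0.000],
        [0.995, 1.000, 0.212, 0.005, 0.003],
        [0.197, 0.212, 1.000, -0.004, -0.003],
        [0.006, 0.005, -0.004, 1.000, 0.966],
        [0.000, 0.003, -0.003, 0.966, 1.000],
    ],
    index=names,
    columns=names,
)

pairs = high_correlation_pairs(excerpt)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

On this excerpt only revenue/sales and week/season pass the assumed threshold; lowering it would surface weaker pairs such as sales/stock.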

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
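
Nullity correlation, as described in the caption above, can be approximated by correlating the missing/present indicators of each column. The following is a minimal sketch under that assumption. The function name `nullity_correlation` and the toy values are hypothetical; only the column names are borrowed from this dataset.

```python
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Correlate the missing/present indicators of every pair of columns."""
    is_missing = df.isnull()
    # A column that is never (or always) missing has a constant indicator,
    # so its correlation is undefined; leave such columns out.
    varying = is_missing.loc[:, is_missing.nunique() > 1]
    return varying.astype(int).corr()


# Toy frame with made-up values, for illustration only
toy = pd.DataFrame(
    {
        "promo_bin_2": [None, "low", None, "high"],
        "promo_discount_2": [None, 20.0, None, 35.0],
        "product_length": [10.0, None, 3.0, 1.7],
        "price": [2.60, 1.95, 2.45, 0.70],
    }
)

nullity = nullity_correlation(toy)
print(nullity.round(3))
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.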

Sample

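index | Unnamed: 0 | store_id | product_id | date | sales | revenue | stock | price | promo_type_1 | promo_bin_1 | promo_type_2 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | product_length | product_depth | product_width | cluster_id | hierarchy1_id | hierarchy2_id | hierarchy3_id | hierarchy4_id | hierarchy5_id | storetype_id | store_size | city_id_old | country_id | city_code | day | weekday | season | week | holiday | month_name
0 | 1793963 | S0030 | P0015 | 2017-01-02 | 0.0 | 0.00 | 4.0 | 2.60 | PR14 | NaN | PR03 | NaN | NaN | NaN | 10.0 | 33.0 | 10.0 | cluster_1 | H00 | H0000 | H000003 | H00000309 | H0000030901 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
1 | 1793964 | S0030 | P0018 | 2017-01-02 | 1.0 | 1.81 | 5.0 | 1.95 | PR14 | NaN | PR03 | NaN | NaN | NaN | 1.0 | 14.0 | 11.0 | cluster_4 | H00 | H0003 | H000316 | H00031609 | H0003160922 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
2 | 1793965 | S0030 | P0035 | 2017-01-02 | 2.0 | 4.54 | 1.0 | 2.45 | PR14 | NaN | PR03 | NaN | NaN | NaN | 3.0 | 17.0 | 12.5 | cluster_7 | H00 | H0003 | H000311 | H00031100 | H0003110017 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
3 | 1793966 | S0030 | P0051 | 2017-01-02 | 0.0 | 0.00 | 27.0 | 0.70 | PR14 | NaN | PR03 | NaN | NaN | NaN | 1.7 | 17.5 | 4.5 | cluster_7 | H00 | H0003 | H000314 | H00031409 | H0003140912 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
4 | 1793967 | S0030 | P0055 | 2017-01-02 | 0.0 | 0.00 | 12.0 | 3.50 | PR05 | verylow | PR03 | NaN | NaN | NaN | 2.3 | 18.5 | 13.5 | cluster_0 | H00 | H0003 | H000311 | H00031109 | H0003110906 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
5 | 1793968 | S0030 | P0057 | 2017-01-02 | 0.0 | 0.00 | 4.0 | 12.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 4.0 | 22.0 | 9.0 | cluster_0 | H01 | H0108 | H010807 | H01080709 | H0108070901 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
6 | 1793969 | S0030 | P0062 | 2017-01-02 | 0.0 | 0.00 | 5.0 | 19.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | NaN | NaN | NaN | cluster_0 | H03 | H0312 | H031205 | H03120507 | H0312050709 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
7 | 1793970 | S0030 | P0099 | 2017-01-02 | 0.0 | 0.00 | 5.0 | 10.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 3.0 | 11.3 | 10.0 | cluster_0 | H03 | H0313 | H031302 | H03130210 | H0313021001 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
8 | 1793971 | S0030 | P0103 | 2017-01-02 | 0.0 | 0.00 | 13.0 | 2.65 | PR14 | NaN | PR03 | NaN | NaN | NaN | 9.0 | 30.0 | 9.0 | cluster_0 | H00 | H0000 | H000003 | H00000300 | H0000030001 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan
9 | 1793972 | S0030 | P0114 | 2017-01-02 | 1.0 | 0.42 | 20.0 | 0.45 | PR14 | NaN | PR03 | NaN | NaN | NaN | 2.0 | 7.5 | 9.0 | cluster_0 | H00 | H0003 | H000312 | H00031209 | H0003120906 | ST03 | 13 | C006 | Turkey | Konya | 2 | Mon | 1 | 1 | N | Jan

index | Unnamed: 0 | store_id | product_id | date | sales | revenue | stock | price | promo_type_1 | promo_bin_1 | promo_type_2 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | product_length | product_depth | product_width | cluster_id | hierarchy1_id | hierarchy2_id | hierarchy3_id | hierarchy4_id | hierarchy5_id | storetype_id | store_size | city_id_old | country_id | city_code | day | weekday | season | week | holiday | month_name
532604 | 8519010 | S0142 | P0718 | 2019-09-30 | 0.0 | 0.0 | 29.0 | 23.75 | PR14 | NaN | PR03 | NaN | NaN | NaN | 5.0 | 18.0 | 10.0 | cluster_0 | H00 | H0004 | H000401 | H00040100 | H0004010026 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532605 | 8519011 | S0142 | P0721 | 2019-09-30 | 0.0 | 0.0 | 6.0 | 14.50 | PR05 | moderate | PR03 | NaN | NaN | NaN | 1.5 | 25.0 | 12.5 | cluster_0 | H01 | H0108 | H010809 | H01080900 | H0108090001 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532606 | 8519012 | S0142 | P0724 | 2019-09-30 | 0.0 | 0.0 | 12.0 | 7.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 6.0 | 24.0 | 10.0 | cluster_0 | H00 | H0003 | H000310 | H00031000 | H0003100001 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532607 | 8519013 | S0142 | P0729 | 2019-09-30 | 0.0 | 0.0 | 2.0 | 69.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 19.0 | 23.0 | 20.0 | cluster_0 | H03 | H0315 | H031508 | H03150800 | H0315080020 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532608 | 8519014 | S0142 | P0731 | 2019-09-30 | 0.0 | 0.0 | 18.0 | 9.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 8.0 | 18.0 | 8.0 | cluster_0 | H03 | H0314 | H031400 | H03140001 | H0314000101 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532609 | 8519015 | S0142 | P0733 | 2019-09-30 | 0.0 | 0.0 | 12.0 | 0.75 | PR14 | NaN | PR03 | NaN | NaN | NaN | 2.0 | 4.0 | 9.0 | cluster_7 | H00 | H0003 | H000314 | H00031405 | H0003140506 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532610 | 8519016 | S0142 | P0741 | 2019-09-30 | 0.0 | 0.0 | 3.0 | 32.90 | PR10 | verylow | PR03 | NaN | NaN | NaN | 3.8 | 16.4 | 9.5 | cluster_0 | H01 | H0106 | H010600 | H01060013 | H0106001345 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532611 | 8519017 | S0142 | P0742 | 2019-09-30 | 0.0 | 0.0 | 5.0 | 69.90 | PR07 | verylow | PR03 | NaN | NaN | NaN | 6.4 | 7.0 | 6.4 | cluster_0 | H01 | H0108 | H010811 | H01081100 | H0108110038 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532612 | 8519018 | S0142 | P0747 | 2019-09-30 | 0.0 | 0.0 | 16.0 | 21.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 23.0 | 23.0 | 33.3 | cluster_0 | H01 | H0107 | H010701 | H01070100 | H0107010026 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep
532613 | 8519019 | S0142 | P0748 | 2019-09-30 | 0.0 | 0.0 | 18.0 | 18.90 | PR14 | NaN | PR03 | NaN | NaN | NaN | 3.8 | 4.8 | 15.3 | cluster_0 | H01 | H0108 | H010801 | H01080110 | H0108011006 | ST04 | 31 | C006 | Turkey | Konya | 30 | Mon | 3 | 40 | N | Sep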